Search for the standard model Higgs boson decaying to a $b\bar{b}$ pair in events with one 
charged lepton and large missing transverse energy using the full CDF data set 

We present a search for the standard model Higgs boson produced in association with a W 
boson in $\sqrt{s} = 1.96$ TeV $p\bar{p}$ collision data collected with the CDF II detector at the Tevatron, 
corresponding to an integrated luminosity of 9.45 fb$^{-1}$. In events consistent with the decay of the 
Higgs boson to a bottom-quark pair and the W boson to an electron or muon and a neutrino, we set 
95% credibility level upper limits on the WH production cross section times the $H \to b\bar{b}$ branching 
ratio as a function of Higgs boson mass. At a Higgs boson mass of 125 GeV/$c^2$ we observe (expect) 
a limit of 4.9 (2.8) times the standard model value. 

The mechanism of electroweak symmetry breaking [1] 
in the standard model (SM) [2] predicts the existence of 
a fundamental scalar boson, the Higgs boson. The SM 
does not predict the mass of the Higgs boson ($m_H$), but 
through the combination of precision electroweak 
measurements [3], including recent top quark and W boson 
mass measurements from the Tevatron, $m_H$ 
is constrained to be less than 152 GeV/$c^2$ at the 95% 
confidence level. Direct searches at LEP2, the Tevatron, 
and the LHC exclude possible masses of the 
SM Higgs boson at the 95% confidence level or the 95% 
credibility level (C.L.), except within the ranges 
116.6-119.4 GeV/$c^2$ and 122.1-127 GeV/$c^2$. At the LHC 
experiments, sensitivity to the Higgs boson primarily comes 
from channels where the Higgs boson decays into two W 
bosons, two photons, or two Z bosons. At the Tevatron, 
searches for a 116-127 GeV/$c^2$ Higgs boson are most 
sensitive to the $b\bar{b}$ final state, which offers complementary 
information on the quark Yukawa couplings to the Higgs 
boson. These searches may then be able to establish 
the mechanism of electroweak symmetry breaking as the 
source of fermionic mass in the quark sector. 

In the $b\bar{b}$ final state, each b quark fragments into a jet of 
hadrons and the Higgs boson signal can be reconstructed 
as an enhancement in the invariant mass distribution of 
these jets. For a pair of jets, the dijet mass resolution 
at CDF is expected to be 10-15% of the pair's mean 
reconstructed mass, which is approximately ten times 
larger than the reconstructed mass resolution in the 
leptonic or photonic search channels at the LHC. Searches 
for the Higgs boson produced in association with a W 
boson (WH), where the W boson decays into a charged 
lepton ($\ell$) and a neutrino ($\nu$), provide the most 
sensitive search channel at the Tevatron in the mass range 
116-127 GeV/$c^2$, because the requirements of a charged 
lepton candidate and of large missing transverse energy 
($\not\!\!E_T$) [10], consistent with a neutrino escaping detection, 
significantly reduce the backgrounds from multijet 
processes. Searches for the SM Higgs boson including this 
final state have been reported by the CDF, D0, ATLAS, 
and CMS collaborations [11-15]. 
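
As a rough illustration of the reconstruction described above, the invariant mass of a jet pair can be computed from the two jet four-momenta. The sketch below is not the CDF reconstruction code; the `(pt, eta, phi, mass)` jet representation and the function names are assumptions made for this example, and no jet energy corrections are applied.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) in GeV for a jet given pt, eta, phi and mass."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz

def dijet_mass(jet1, jet2):
    """Invariant mass (GeV) of two jets, each given as (pt, eta, phi[, mass])."""
    e1, x1, y1, z1 = four_vector(*jet1)
    e2, x2, y2, z2 = four_vector(*jet2)
    e, px, py, pz = e1 + e2, x1 + x2, y1 + y2, z1 + z2
    m_squared = e * e - px * px - py * py - pz * pz
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m_squared, 0.0))
```

With a fractional resolution of 10-15%, a 125 GeV/$c^2$ signal reconstructed this way would be smeared over roughly 12-19 GeV/$c^2$.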

In this Letter we describe a search for the Higgs boson 
in the $WH \to \ell\nu b\bar{b}$ channel using the full data set 
collected during Run II of the Collider Detector experiment 
at the Fermilab Tevatron (CDF). The CDF experiment is 
a general-purpose detector described in Ref. [16]. These 
data correspond to an integrated luminosity of 9.45 fb$^{-1}$ 
of $p\bar{p}$ collisions. Many aspects of the analysis remain 
unchanged from a recent search based on 7.5 fb$^{-1}$, where 
they are described in more detail. Events are collected with 
online selection criteria (triggers) that require one of the 
following signatures: an electron candidate with trans- 
verse energy exceeding 18 GeV/c [3]; a muon candidate 
with transverse momentum ($p_T$) exceeding 18 GeV/$c$; or 
$\not\!\!E_T(\mathrm{cal}) > 15$ GeV with a forward ($|\eta| > 1.2$) 
electromagnetic energy cluster satisfying $E_T > 20$ GeV (designed to 
accept forward electrons from the W boson decay). An 
additional set of triggers is included that does not 
explicitly require an identified lepton, but instead requires 
$\not\!\!E_T(\mathrm{cal}) > 45$ GeV, or $\not\!\!E_T(\mathrm{cal}) > 35$ GeV and a pair of 
jets. 

The identification of leptons and jets closely follows 
that for the CDF single-top-quark discovery. Candidate 
events are selected by requiring the presence of exactly 
one lepton candidate with $p_T > 20$ GeV/$c$. The required 
$\not\!\!E_T$ is specific to each class of reconstructed lepton 
candidate, chosen to satisfy trigger requirements and 
suppress instrumental backgrounds: events with an electron 
satisfying $|\eta| < 1.1$, an electron satisfying $|\eta| > 1.1$, 
a non-isolated electron [21], a muon, or an isolated track 
are required to have $\not\!\!E_T > 20$, 25, 25, 10, or 
20 GeV, respectively. Events are required to have exactly 
two or three jets satisfying $|\eta| < 2.0$ and $E_T > 20$ GeV 
after corrections for instrumental effects [22]. Events are 
rejected if they are kinematically inconsistent with 
leptonic W boson decays as determined by a support vector 
model specific to each lepton category [23]. Each support 
vector model is a binary classifier resulting from 
supervised training using information about the energies and 
angles of the lepton, jets, and missing energy. 
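
The kinematic requirements listed above can be summarized in a short sketch. This is an illustration only, not CDF analysis code: the category labels, the function name, and the event representation are assumptions, and the trigger and support-vector-model requirements are left out.

```python
# Minimum missing transverse energy (GeV) per lepton category, as quoted above.
# The dictionary keys are hypothetical labels, not CDF software identifiers.
MIN_MET_GEV = {
    "central_electron": 20.0,       # electron with |eta| < 1.1
    "forward_electron": 25.0,       # electron with |eta| > 1.1
    "non_isolated_electron": 25.0,
    "muon": 10.0,
    "isolated_track": 20.0,
}

def passes_preselection(lepton_category, lepton_pt, met, jets):
    """Apply the lepton, missing-energy and jet-multiplicity requirements.

    jets is a list of (et, eta) pairs after energy corrections, with et in GeV.
    The event is assumed to contain exactly one lepton candidate already.
    """
    if lepton_pt <= 20.0:
        return False
    if met <= MIN_MET_GEV[lepton_category]:
        return False
    selected = [jet for jet in jets if jet[0] > 20.0 and abs(jet[1]) < 2.0]
    # Exactly two or three jets must satisfy the E_T and |eta| requirements.
    return len(selected) in (2, 3)
```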

At least one of the jets must be identified (tagged) as 
consistent with the fragmentation of a b quark according 
to a neural network tagging algorithm [24]. For each jet 
containing at least one charged particle track, the 
algorithm produces a scalar value in the range -1 to 1. By 
comparing this value to two predetermined thresholds, 
the jet is classified as not tagged, loose tagged (L), or 
tight tagged (T), with all tight-tagged jets also satisfying 
the loose-tag definition. The thresholds are chosen to 
optimize the combined expected exclusion sensitivity in 
simulated events, and the performance of the T and L b-tag 
selection is described in Ref. [24]. The search sample 
is composed of seven orthogonal categories according to 
the exact number and type of b tags in the event: TT, 
TL, T, LL, L for two-jet events, and TT, TL for three-jet 
events. If an event satisfies two categories, the category 
of highest signal purity is chosen. The inclusion of 
additional b-tag categories for events with three jets offers 
negligible improvement to the expected sensitivity and 
they are therefore not included. The tagging algorithm 
and strategy employed here are identical to those described 
in the Tevatron combined observation of diboson production 
with decays to heavy-flavor quarks [25]. 
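
The categorization just described can be sketched as follows. The two threshold values are placeholders chosen for illustration (the actual thresholds are not given in this text), and the treatment of three-jet events, which here must contain exactly two tagged jets, is an assumption about how "exact number and type of b tags" is applied.

```python
# Hypothetical thresholds on the tagger output, which ranges from -1 to 1.
LOOSE_CUT = 0.0
TIGHT_CUT = 0.7

def tag_level(score):
    """Return "T", "L" or None for one jet; a tight jet is reported only as "T"."""
    if score > TIGHT_CUT:
        return "T"
    if score > LOOSE_CUT:
        return "L"
    return None

# Tag patterns used in the search, keyed by jet multiplicity.
ALLOWED = {2: ("TT", "TL", "T", "LL", "L"), 3: ("TT", "TL")}

def event_category(scores):
    """Map the tagger outputs of an event's jets to one search category, or None."""
    levels = [tag_level(s) for s in scores]
    # Each jet contributes its tightest tag, so every event has a single pattern.
    pattern = "T" * levels.count("T") + "L" * levels.count("L")
    if pattern in ALLOWED.get(len(scores), ()):
        return f"{len(scores)}jet_{pattern}"
    return None
```

Assigning each jet its tightest tag level is one way to keep the categories orthogonal: an event with two tight tags also satisfies the looser definitions but is counted only as TT.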

The Higgs boson events are modeled with the 
PYTHIA [26] Monte Carlo event generator combined with 
a detailed simulation of the CDF II detector [27, 28] and 
tuned to the Tevatron underlying-event data [29]. Small 
corrections to the simulated response of the detector are 
made based on data-simulation comparisons from 
orthogonal data sets. Models for background processes 
are derived from a mixture of simulation and data-driven 
techniques. Background processes to $WH \to \ell\nu b\bar{b}$ 
include W or Z bosons produced in association with jets. 
These processes may include true b jets as in $W + b\bar{b}$, 
or non-b jets that have been misidentified as b jets, as in 
$W + c\bar{c}$, $W + cj$, and $W + jj$, where $j$ refers to jets 
not originating from heavy-flavor quarks. Events with 
a top quark ($t\bar{t}$ and single-top-quark production), 
diboson events, and multijet events without W bosons also 
contribute to the sample composition. 

The distributions of the reconstructed dijet invariant 
mass [32] of background and simulated Higgs boson 
events in the categories that contribute most to the 
sensitivity are shown in Fig. 1, with categories of comparable 
signal purity summed together. The two-jet 
single-loose-tagged sample, L, contains twice as many events and has 
ten times smaller signal purity than the other two-jet 
categories combined. This category contributes less than 
1% to the total expected exclusion sensitivity and is not 
presented in Fig. 1. Event yields are stated as sums of 
categories corresponding to those presented in the 
figure. The total expected signal yield in the current data 
set, assuming $m_H = 125$ GeV/$c^2$ and based on 
next-to-next-to-leading-order (NNLO) theory predictions for the 
production rate, is 12.9 ± 1.1 (12.4 ± 0.9) for the 
TT+TL (T+LL) categories. Events with exactly three 
jets account for ~10% of the total expected signal yield. 
The background expectation of 1500 ± 400 (6600 ± 1900) 
for TT+TL (T+LL) events is significantly larger than the 
expected number of signal events. The invariant mass of 
the jets is the most discriminating variable between 
signal and background, but greater signal significance is 
achieved by using additional kinematic information 
available in each event. 
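
To put the quoted yields in perspective, the short calculation below evaluates two crude figures of merit, s/b and s/√b, from the central values above. These quantities are an illustrative assumption and are not the sensitivity measure used in the analysis; they ignore the quoted uncertainties on the background and all shape information.

```python
import math

# Expected yields quoted in the text for m_H = 125 GeV/c^2 (central values only).
yields = {
    "TT+TL": {"signal": 12.9, "background": 1500.0},
    "T+LL": {"signal": 12.4, "background": 6600.0},
}

def figures_of_merit(signal, background):
    """Return (s/b, s/sqrt(b)) for a simple counting comparison."""
    return signal / background, signal / math.sqrt(background)

summary = {name: figures_of_merit(**y) for name, y in yields.items()}

for name, (purity, significance) in summary.items():
    print(f"{name}: s/b = {purity:.4f}, s/sqrt(b) = {significance:.2f}")
```

Both categories give s/√b well below one when treated as single counting experiments, which is why the analysis relies on the shape of a multivariate discriminant rather than on event counts.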

We employ a Bayesian artificial neural network 
(BNN) [35] trained to discriminate $WH \to \ell\nu b\bar{b}$ 
signal from the background using the information 
contained in the following kinematic variables: the invariant 
mass of the candidate Higgs-boson-decay jets; the 
maximum invariant mass of the lepton, $\not\!\!E_T$, and one of 
the two jets ($\max(M_{\ell\nu j_1}, M_{\ell\nu j_2})$); the lepton 
electric charge times its pseudorapidity; the scalar sum of 
the lepton and jet transverse momenta minus the $\not\!\!E_T$ 
($\sum_{\mathrm{jets}} E_T + p_T^{\ell} - \not\!\!E_T$); the scalar sum of the transverse 
energy of calorimeter jets that fail the jet energy selection 
criteria ($\sum_{\text{low-}E_T\text{ jets}} E_T$); the absolute value of the 
transverse momentum of the reconstructed W boson, 
reconstructed as $\vec{p}_T^{\,\ell} + \vec{\not\!\!E}_T$; and the scalar sum of the jet, lepton, 
and neutrino transverse energies ($\sum_{\mathrm{jets}} E_T + p_T^{\ell} + \not\!\!E_T$). 
The BNN combines the discriminating power of these 
variables into a single output variable which, when used 
in searches for a 125 GeV/$c^2$ Higgs boson, is capable 
of excluding cross sections times branching ratios 27% 
lower in the background-only hypothesis as compared to 
searches using the jet invariant mass alone. 
Improvements for other mass hypotheses are comparable. We 
validate the predictions of the background model for each 
input variable in data control regions and we optimize 
the discriminants separately for each Higgs boson mass 
hypothesis. The distributions of the BNN outputs of 
the neural network trained for a Higgs boson mass of 
125 GeV/$c^2$ are shown in the right panels of Fig. 1. 
Additional sensitivity from the three-jet categories is gained 
by training and employing a BNN to separate 
top-quark-like from $W$+jets-like events, independently from the 
BNN trained to separate WH events from background. 
In the right panel of Fig. 1(c), top-quark-like events 
occupy the range 0-1 of the discriminant, while 
$W$+jets-like events occupy the range 1-2. 
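
One way to picture the three-jet discriminant described above is as a single axis built from two network outputs. The sketch below is a guess at such a construction, not the method used in the analysis: it assumes both outputs lie in [0, 1], that a cut on the top-versus-$W$+jets output selects the region, and that the cut value is 0.5. None of these details are specified in the text.

```python
def three_jet_discriminant(wh_score, top_score, top_cut=0.5):
    """Place an event on a 0-2 axis split into two regions.

    wh_score:  output of the network separating WH from background (assumed 0-1).
    top_score: output of the network separating top-quark-like from W+jets-like
               events (assumed 0-1, larger meaning more top-quark-like).
    top_cut:   hypothetical boundary between the two regions.
    """
    if not 0.0 <= wh_score <= 1.0:
        raise ValueError("wh_score must lie in [0, 1]")
    if top_score > top_cut:
        return wh_score          # top-quark-like region, values in 0-1
    return 1.0 + wh_score        # W+jets-like region, values in 1-2
```

Splitting the sample this way keeps the two background-dominated populations in separate bins of the final distribution, so each can be constrained separately in the fit.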

We calculate a Bayesian C.L. limit for each mass 
hypothesis using the combined binned likelihood of the 
BNN output distributions. Each of the seven jet-tagging 
categories is subdivided into four orthogonal lepton 
categories, depending on their distinct instrumental 
backgrounds. After exclusion of two low-signal 
combinations, the analysis comprises 26 independent 
channels that are included in the likelihood. The benefit of 
this subdivision of the search sample is both higher signal 
significance and the isolation of individual background 
components for systematic constraint. A posterior 
density is obtained by multiplying this likelihood by 
Gaussian prior densities for the background normalizations 
and systematic uncertainties, leaving $\sigma \times \mathcal{B}(H \to b\bar{b})$ with 
a uniform prior density, with priors truncated to prevent 
negative predictions. A 95% C.L. limit is determined 
such that 95% of the posterior density for $\sigma \times \mathcal{B}(H \to b\bar{b})$ 
accumulates below the limit. 
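
The limit-setting procedure can be illustrated with a deliberately simplified sketch: a binned Poisson likelihood, a uniform prior on the signal strength, and no nuisance parameters. Omitting the Gaussian priors on background normalizations and systematic uncertainties is a major simplification relative to the analysis, and the function name, the grid integration, and the `mu_max` and `steps` parameters are assumptions made for this example.

```python
import math

def bayesian_upper_limit(observed, background, signal,
                         cred=0.95, mu_max=50.0, steps=5000):
    """Upper limit on the signal strength mu with a flat prior on [0, mu_max].

    observed, background, signal are per-bin lists; background must be > 0.
    The expected count in each bin is background + mu * signal.
    """
    mus = [mu_max * i / steps for i in range(steps + 1)]
    log_post = []
    for mu in mus:
        log_like = 0.0
        for n, b, s in zip(observed, background, signal):
            expected = b + mu * s
            # Poisson log-likelihood up to a constant that does not depend on mu.
            log_like += n * math.log(expected) - expected
        log_post.append(log_like)
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    running = 0.0
    for mu, weight in zip(mus, weights):
        running += weight
        if running >= cred * total:
            return mu  # first grid point with at least `cred` of the posterior below it
    return mu_max
```

For a single bin with zero observed events the posterior is a pure exponential in the signal yield, and the function returns a value close to the familiar limit of about three expected signal events.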

Systematic uncertainties on the rate of signal and 
background production from the jet energy scale, b-tagging 
efficiencies, lepton identification and trigger efficiencies, the 
amount of initial and final state radiation (ISR and FSR), 
and the parton distribution functions are included in 
the limit calculation. In addition, the limit 
calculation includes shape uncertainties on the discriminant 
output, arising from uncertainties on the jet energy 
scale, ISR, and FSR for all simulated samples, and 
from uncertainties on the renormalization and 
factorization scale for $W$+jets samples. The expected 
exclusion limits are ~20% tighter if the calculation is 
performed without including systematic uncertainties. The 
impact of kinematic differences between simulated and 
data $W$+jets events is investigated as a potential source 
of systematic uncertainty. The jet energies, angular 
separations, and invariant mass distributions of events 
selected prior to b tagging are used to derive shape 
corrections, which are applied to simulated $W$+jets events 
in the search samples. These adjustments show negligible 
impact on the discriminant shape of the background 
prediction, and therefore are not considered in the final 
results. 

Table I and Fig. 2 show the expected and observed 
limits calculated for different Higgs boson masses. We find 
an observed (expected) 95% C.L. limit of 4.9 (2.8) times 
the SM prediction of the production cross section times 
branching fraction for a Higgs boson mass of 125 GeV/$c^2$ 
(NNLO theory predicts $\sigma \times \mathcal{B}(H \to b\bar{b}) = 75$ fb). The 
resulting expected exclusion limit is approximately a 
factor of 2.6 lower than our previous Letter [11]. This 
improvement in expected sensitivity consists of a factor of 
approximately 1.9 due to the increased data set and a 
factor of approximately 1.4 due to analysis technique 
improvements. Increased signal acceptance and background 
rejection gained from the improved b-tagging algorithm 
provide approximately 11% improvement in exclusion 
sensitivity. The inclusion of three-jet events, increased 
trigger acceptance, improved rejection of multijet events, 
and additional lepton acceptance via new reconstruction 
categories dominate the remaining improvement. The 
two-jet TT and TL categories offer the highest signal 
purity, driving the sensitivity of the analysis. Performing 
the analysis using these two categories alone produces 
expected and observed limits comparable to the full 
analysis combination, with an observed (expected) limit of 4.8 
(3.2) times the SM $\sigma \times \mathcal{B}(H \to b\bar{b})$ for $m_H = 125$ GeV/$c^2$. 
The consistency of the observed limits with the signal 
hypothesis is tested by statistical sampling of the 
signal-plus-background model. These studies indicate that the 
median upper limit in the SM Higgs scenario is approximately 
one unit of the SM cross section higher than that for the 
background-only hypothesis over most of the 90-150 GeV/$c^2$ search 
range, which is consistent with the observed limits to 
within one standard deviation. 

FIG. 1: The distribution for the dijet mass used as an input to the BNN (left), and the BNN output distribution (right). Event 
b-tag categories of comparable signal purity are combined and presented as three orthogonal subsamples: two-jet TT+TL (a), 
two-jet T+LL (b), and three-jet TT+TL (c). The background is normalized to its prediction and the signal expectation for 
a Higgs boson mass of 125 GeV/$c^2$ is scaled to 10, 100, and 100 times the SM prediction in (a), (b), and (c), respectively. 
The right panel of (c) shows the BNN distribution for events with exactly three jets, split into two regions based on an 
independent discriminant designed to separate top-quark-like events (assigned values between zero and one) from $W$+jets-like 
events (assigned values between one and two). Statistical uncertainties are shown for the data points. 
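
The limits quoted for a Higgs boson mass of 125 GeV/$c^2$ are expressed in units of the SM prediction. As a simple cross-check, the arithmetic below converts them to absolute values using the 75 fb NNLO prediction given in the text, and multiplies the two quoted improvement factors. The variable names are illustrative only.

```python
# SM prediction for sigma x B(H -> bb) at m_H = 125 GeV/c^2, from the text (fb).
SM_SIGMA_TIMES_BR_FB = 75.0

# 95% C.L. limits in units of the SM prediction, from the text.
limits_in_sm_units = {"observed": 4.9, "expected": 2.8}

# Absolute limits in fb: ratio to the SM prediction times the prediction itself.
limits_in_fb = {
    name: ratio * SM_SIGMA_TIMES_BR_FB
    for name, ratio in limits_in_sm_units.items()
}

# Product of the quoted gains from the larger data set and improved techniques.
combined_gain = 1.9 * 1.4

print(limits_in_fb)
print(round(combined_gain, 2))
```

The product of the two rounded factors is about 2.7, in line with the quoted overall improvement of approximately 2.6 once rounding of the individual factors is taken into account.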

In conclusion, we have presented a search for the SM 
Higgs boson produced in association with a W boson 
using the complete CDF Run II data set. This anal- 
ysis employs methods used in CDF analyses of well 
established SM processes, providing confidence in the 
robustness of the background model and search 
techniques. The observed exclusion limits exceed those 
expected in the background-only scenario over much of the 
90-150 GeV/$c^2$ search range, with deviations from the 
background-only hypothesis corresponding to local 
significances of roughly two sigma for tested Higgs boson 
masses between 120 and 135 GeV/$c^2$. While the LHC 
experiments have surpassed the Tevatron experiments in 
overall sensitivity to a SM Higgs boson, the 
$WH \to \ell\nu b\bar{b}$ search reported here is currently the most sensitive 
single-channel search for a low-mass SM Higgs boson in 
its favored decay mode. 

FIG. 2: The observed 95% C.L. upper limits on Higgs boson 
production relative to the SM expectation as a function of the 
Higgs boson mass. The median expected limits in the absence 
of a Higgs signal are indicated by the dashed line and the 
light and dark bands indicate the one and two sigma ranges 
for individual experiments in this background-only scenario. 